Which demographic processes control competitive equilibria? Bayesian calibration of a size‐structured forest population model

Abstract In forest communities, light competition is a key process for community assembly. Species' differences in seedling and sapling tolerance to shade cast by overstory trees is thought to determine species composition at late‐successional stages. Most forests are distant from these late‐successional equilibria, impeding a formal evaluation of their potential species composition. To extrapolate competitive equilibria from short‐term data, we therefore introduce the JAB model, a parsimonious dynamic model with interacting size‐structured populations, which focuses on sapling demography including the tolerance to overstory competition. We apply the JAB model to a two‐“species” system from temperate European forests, that is, the shade‐tolerant species Fagus sylvatica L. and the group of all other competing species. Using Bayesian calibration with prior information from external Slovakian national forest inventory (NFI) data, we fit the JAB model to short time series from the German NFI. We use the posterior estimates of demographic rates to extrapolate that F. sylvatica will be the predominant species in 94% of the competitive equilibria, despite only predominating in 24% of the initial states. We further simulate counterfactual equilibria with parameters switched between species to assess the role of different demographic processes for competitive equilibria. These simulations confirm the hypothesis that the higher shade tolerance of F. sylvatica saplings is key for its long‐term predominance. Our results highlight the importance of demographic differences in early life stages for tree species assembly in forest communities.

The forested areas in Slovakia are mostly part of the European temperate vegetation zone and have similar potential natural beech and mixed beech forests (Bohn and Gollub, 2006) and current species composition (Tables S3-S4) as in Germany. The selected forest plots in Slovakia have F. sylvatica as the predominant species with about 28% of the basal area.
In the Slovakian NFI, full tree counts and circumference measurements were conducted on 500 m 2 fixedarea plots that are arranged on a regular quadratic grid (Šebeň, 2017). The lower size threshold for sampling trees is dbh 7 cm. Smaller seedlings and saplings with height 10 cm and greater were sampled on varying circles with increasing areas, i.e. 3.14, 7. 07, 12.57, 28.27, and 50.26 m 2 . The sampling area for seedlings and saplings was adapted depending on field estimates of the tree density so that at least 15 trees were counted.
To infer seedling recruitment rates from the Slovakian NFI, we selected only plots without records of unnatural regeneration. This reduced the number of plots from 1412 to 1402 (Table S1).

Appendix A.2 Accounting for varying sampling areas with offsets Appendix A.2.1 Offsets for the angle count method
Count data from the angle count method (size stage A in the German NFI) were related to the count states in the JAB model by scaling with area offsets. In angle count sampling, each tree has a specific sampling area a dependent on its dbh with a = π k 2 dbh 2 , where k is a constant for the sampling angle (here, k = 25). Thus, for the sampling area of a plot, which corresponds to the actually observed trees on a plot we use the weighted To relate count data from the angle count method to the basal area state in the model (size stage B in the German NFI), we used an offset factor that not only includes the sampling area a p but transforms the basal area per hectare in the JAB model to counts. This was achieved with an offset o that also included the total observed basal area ba p on a plot: o = a p c p ba p [ha m −2 ] . By multiplying the model statex [m 2 ha −1 ] with this offset we transformed the model state to counts, by including the data on basal area ("how many counts are one unit of basal area?") on the right hand side of the model statement: x ∼ a p c p ba px .

Appendix A.2.2 Offsets for counts on fixed-area plots
Counts on fixed-area plots (seedlings in the Slovakian NFI and saplings J in the German NFI) were also standardized using area offsets. In the statistical model inferring priors for the seedling recruitment rates r from the Slovakian NFI data (see Section Appendix A.4), we used areas of the corresponding subplots as offsets. In the German NFI data, the size class J of the JAB model consists of multiple size classes with different sampling areas, which also change between surveys (Table S2). As in the angle count data, we calculated a weighted mean area per plot, where the different sampling areas were weighted by the counts per area within the corresponding size classes. The weighted mean area per NFI plot was used as an offset to relate the sapling counts to the area-standardized counts of J in the JAB model [ha −1 ].

Appendix A.2.3 Offsets for zero observations
Because all NFI sampling protocols made observation areas dependent on size or abundance (see Section Appendix A.1.2 and Appendix A.1.1), the observation area was not readily available when there were no observations for a stage on a plot. For zero observations in the angle count data in the German NFI (size classes A and B), we used the sampling area of the other species, or, if both species had zero observations, the mean sampling area per size class and survey as an offset (for a test of the robustness of this approach, see below). To be able to compute a mean area for the largest size class B, we truncated the angle count data to a maximum sampling radius of 14 m. The truncation included dropping all trees outside the sampling radius, which is about the 98th percentile of tree counts in A and B and assigning the new maximum sampling area to all trees that were within the maximum radius but had a dbh that was originally corresponding to a sampling area greater than the new maximum area (dbh > 60 cm). Truncating the angle count data to a fixed maximum area removes a source of bias (missed tree observations at higher radii) at the cost of moderate added variance (Berger et al., 2020), and furthermore leads to a finite sampling area that can be averaged over. The mean sampling area per size class (A and B) was calculated by dividing the definite integral from 0 up to the respective maximum radius-by the maximum radius. This mean area was used as an offset for zero observations in size class A. To obtain an offset for zero observations in size class B, we also needed an additional factor that scaled the basal area to the scale of observed counts. Instead of multiplying the area with the plot-specific factor c p ba p (as for non-zero observations, see Section Appendix A.2.1), we included the mean of the factor per survey and additionally per species to differentiate between their average basal areas.
We made sure that the choice of the particular area offset for zero observations did not affect the relations of the parameter estimates, nor the conclusions regarding the extrapolated equilibria, nor the importance of the "shading" parameter s, by running the entire analysis pipeline with three alternative choices for the sampling area of zero observations: the count per hectare-weighted mean of all observation areas per survey and species, as well as the analogous first quantile and the third quantile (weighted quantiles from package Desc-Tools version 0.99.45; Signorell, 2021).
Zero observations in the sapling counts of the size class J in the German NFI were assigned an equally weighted mean of all possible sampling areas for the respective set of smaller size classes within the corresponding survey (Table S2).
For zero observations in seedling counts in the Slovakian NFI there were two possible cases: If only one of Fagus and others had no seedling observation on a plot in a year, the zero observation was assigned the sampling area from the respective other species. If there was no seedling observation at all, we used the maximum possible area (50.26 m 2 ), because the sampling protocol prescribed increasing the subplot area until sufficient seedling density was reached (see Section Appendix A.1.2).

Appendix A.3 Species' regional abundance for predicting seedling input
To derive the regional species abundance B p used to estimate the external seedling input (parameter l , see Section ??) we used thin plate spline regressions with the geographic coordinates as predictors to interpolate the species-specific basal area per hectare on the German NFI grid (Table S6, Figure S1).
For fitting the thin plate splines from the regular standard grid of the German NFI (4 km; see Section Appendix A.1.1), which is homogeneous across Federal states, we selected only the basal area records from the last survey (2012), because the grid had changed over surveys. We completed non-forested clusters of the regular grid, and also all of the four plots within clusters, with zero observations to obtain an unbiased sample of the geographical density of tree species. To obtain a response variable for each set of coordinates, we calculated the species abundances per cluster by averaging the basal area per hectare above the sampling threshold (BA) over the four plots. This average basal area per hectare on the completed grid was spatially smoothed with a twodimensional geographic thin plate spline. Based on the coordinates, a "spline on sphere" was fitted with the gam() function from the package mgcv (k = 200; version 1.8-39; Wood, 2021), where the average basal area per hectare was the response. The response variable had been rounded to an integer to be able to fit a negative binomial response. The predicted plot-specific regional species abundances B p were used as a proxy for the probability of seedling input, which factors into the JAB model with a species-specific effect size l , L p = B p l .
By using a strictly positive negative binomial response for B p , it was made sure that plots were not excluded from invasion in the JAB model.

Appendix A.4 Prior inference on seedling recruitment
To infer priors for the rate r , i.e. seedling recruitment by the local basal area, we fitted a regression model with the data from the Slovakian NFI (Table S5). The Slovakian NFI includes counts of small seedlings with height 10-20 cm, which is right below the size class J in the German NFI (height 20 cm to dbh 7 cm, Section ??). The density of the small seedlings on a plot was used as a proxy for the number of seedlings having appeared in a year on a plot. Although the size class height 10-20 cm neither completely covers yearlings nor excludes older or younger trees, data on seedling growth suggests that this size class can be a rough approximation for yearling count density: The common conifer Pinus sylvestris L., e.g., reaches average height 20.0 cm after two years (Geudens et al., 2004), Fagus sylvatica and Quercus robur L. both have an average height between 10 and 15 cm after one year (Welander and Ottosson, 1998). As a predictor of yearling count density we used the conspecific basal area per hectare above the sampling threshold (dbh ≥ 10 cm).
We fitted a joined hurdle regression model for Fagus and others, where the species-specific yearly seedling recruitment rate r is a function of the local and conspecific basal area B A. All other yearly seedling input, i.e.
appearing seedlings due to long-distance or temporal dispersal, is included as an intercept k. The hurdle model is implemented as a two-component mixture where the probability density D is the sum of a Bernoulli distribution and a zero-truncated Negative Binomial distribution (ZTNB), parameterized with mean and precision φ, such that:Ŝ ps = k s + r s BA ps (S1) where S ps are the observed andŜ ps the predicted seedling density, which are a function of intercept k and conspecific basal area B A ps with slope r , and where variables vary with plot p and species s and plots can have one of the sampling area levels a. The Bernoulli density d p describes the probability of observing no seedlings, which has an inverse logit-transformed linear predictor dependent on the latent total seedling density on a plot across speciesΣ p with slope m, but also on the factor sampling area (parameter θ a ). The hurdle model with a separate process for zero observations was chosen to account for the deflated zeroes with small sampling areas and inflated zeroes with great sampling areas. A deflation of zeroes at smaller sampling areas that could not be explained by a Negative Binomial distribution was likely a product of the sampling protocol, where the sampling area was increased in steps when the total seedling density was too small, depending on field estimates (see Section Appendix A.1.2). Because the total seedling densityΣ p is included in the linear predictor for the probability of observing a zero d p , other than in a classical hurdle model, the zeroes in the data also inform the predicted seedling densityŜ ps . The zero-truncated Negative Binomial distribution, which describes the non-zero counts, is parameterized with meanŜ p multiplied by an offset for the sampling areaÂ a and precision φ a , which also varies with the levels of sampling area.
The model was fitted with weakly informative, normally-distributed priors for log-transformations of the parameters of interest log r ∼ N (5, 10), log k ∼ N (5, 10)-and regularizing priors θ ∼ N (0, 2), m ∼ N (0, 2), and the half-normal 1 $ φ ∼ H (1). The resulting posterior distribution of the basal area-dependent seedling recruitment rate log r was propagated as a prior for the corresponding parameter in the JAB model (Section ??; Tables S5 and ??).

Appendix B Code repository
All software to reproduce the study, including the JAB model, has been made openly available at https://do i.org/10.5281/zenodo.8032461.